
6 research outputs found

Video Object Segmentation by Tracking Structured Key Points and Contours

Author: Caelles Prat Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2016

In this thesis, we tackle the problem of video object segmentation, where every pixel of every frame in a video sequence must be classified as background or foreground. Our algorithms fall in the semi-supervised category, i.e., they start with the object of interest annotated in the first frame and then track and segment that object in the following frames. The first algorithm describes the object of interest as a set of points distributed on the object and then tracks them in the following frames. To make the tracking robust, we require the spatial distribution of these points to be stable across frames. To do so, we place a mesh on top of the object mask, whose vertices are the interest points to track and whose edges define the spatial structure among them. We then compute an appearance descriptor for each point and look for the displacements that bring each point, in the following frame, to a location with a similar descriptor. We enforce that the displacements of neighboring points are similar, which favors coherent deformations of the object. This algorithm may experience difficulties at the contours of the object, as the point descriptors might be influenced by the background. To overcome this problem, our second algorithm tracks the contour of the object by imposing smooth deformations between frames. Starting from a polygonal representation of the object contour, we look for the locations in the following frame that have a strong edge-detector response while minimizing the deformation of the shape. Specifically, we build a multiscale pyramid of segments of the contour polygon and look for the displacement of every segment that matches the edge response while being coherent with the rest of the elements of the pyramid.
This second algorithm can be understood as complementary to the first one, since it might fail on objects with low-contrast contours or cluttered backgrounds. As an overall trade-off, we propose a combination of the two algorithms that tries to make the most of each of them and compensate for their weaknesses. To validate our approaches, we perform an extensive evaluation on a recently published database called DAVIS, which provides fifty sequences with the ground truth annotated in each frame. We sweep the parameters of the algorithms in order to achieve the best performance on this database. The results show that the contour algorithm outperforms the mesh algorithm, so the weaknesses described above are more prominent in the mesh algorithm. For the combination of both, although we have not been able to do a full search of the parameter space, the results are promising and suggest that a broader parameter search would outperform either standalone method. We also perform a comparison against six state-of-the-art algorithms, which shows that although we are still behind the better-performing ones, our approach might be competitive with further tuning and experimentation.
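
The mesh-tracking idea in this abstract (match each point's descriptor while keeping the displacements of neighboring points similar) can be sketched as a small energy minimization. The sketch below is illustrative only: the function name `track_mesh`, the squared-difference smoothness term, the weight `smooth_weight`, and the iterated-conditional-modes optimizer are all assumptions made for the example, not the formulation used in the thesis.

```python
def track_mesh(unary, edges, candidates, smooth_weight=1.0, sweeps=10):
    """Pick one displacement per mesh vertex (hypothetical sketch).

    unary[v][k] : descriptor mismatch of vertex v under candidates[k]
    edges       : list of (u, v) vertex index pairs (the mesh structure)
    candidates  : list of (dx, dy) displacements shared by all vertices
    """
    n = len(unary)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Start from the best appearance match of each vertex on its own.
    labels = [min(range(len(candidates)), key=lambda k: unary[v][k])
              for v in range(n)]

    def pair_cost(a, b):
        # Squared difference between two candidate displacements
        # (assumed smoothness penalty).
        (ax, ay), (bx, by) = candidates[a], candidates[b]
        return (ax - bx) ** 2 + (ay - by) ** 2

    # Iterated conditional modes: update one vertex at a time, holding
    # its neighbors fixed, until no label changes.
    for _ in range(sweeps):
        changed = False
        for v in range(n):
            def local(k):
                smooth = sum(pair_cost(k, labels[w]) for w in neighbors[v])
                return unary[v][k] + smooth_weight * smooth
            best = min(range(len(candidates)), key=local)
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return [candidates[k] for k in labels]
```

With three vertices in a chain, where the middle vertex has an ambiguous descriptor match, the smoothness term pulls it toward the displacement its neighbors agree on; setting `smooth_weight=0.0` reduces the method to independent per-point matching.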

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Implementation of DSP algorithms in VHDL for high-speed optical communications

Author: Caelles Prat Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2014

The traffic that backbone networks must carry grows by about 30 to 60 % every year. The solution used over the last two decades is wavelength-division multiplexing (WDM), more recently combined with complex constellations on each wavelength, but these systems leave little room for improvement. The main reason is that current experiments are close to the nonlinear Shannon limit for single-mode fibre (SMF), a problem known as the capacity crunch. Researchers have to find other ways to keep increasing bandwidth to meet the growth in traffic demand. The approach on which they are focusing their efforts is space-division multiplexing (SDM). One of the problems this new technology faces is crosstalk among the parallel SDM channels. The current approach to overcome it is to use multiple-input multiple-output (MIMO) techniques in the digital signal processing (DSP) part of the system. In fact, DSP has been used since the development of coherent systems in 2010 made it possible, e.g., to eliminate the crosstalk among polarizations when transmitting with polarization-division multiplexing (PDM). This thesis, developed during the first four months of a seven-month internship at Bell Labs in Crawford Hill (USA), presents several algorithms for the DSP part of the system. The main feature of the designs is that they can run in real time; they are therefore implemented in VHDL to run on an FPGA. The algorithms developed are a digital phase shifter to interpolate digital signals, a 64-point discrete Fourier transform and a frequency-domain equalizer. These are the first steps towards the final implementation of a complete real-time MIMO receiver.
This will be an important achievement, as currently no publication uses a real-time MIMO receiver in the DSP part of an optical system. All the implementations have been tested with satisfactory results. However, further optimizations will be needed to fit the complete MIMO receiver in a relatively large FPGA.
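
Two of the building blocks named in this abstract, the 64-point discrete Fourier transform and the frequency-domain equalizer, can be illustrated with a floating-point reference model. This is only a sketch under stated assumptions: the thesis implements its designs in VHDL for an FPGA, whereas the code below is plain Python; the direct O(N²) DFT, the one-tap-per-bin zero-forcing equalizer, the known channel taps, and all function names are hypothetical choices made for the example.

```python
import cmath

N = 64  # transform size named in the abstract


def dft(x):
    """Direct discrete Fourier transform (reference model, not optimized)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse DFT with 1/n normalization."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_convolve(x, h):
    """Model a channel acting on one block as a circular convolution."""
    n = len(x)
    return [sum(h[m] * x[(t - m) % n] for m in range(len(h))) for t in range(n)]


def equalize(received, channel_taps):
    """Zero-forcing equalizer: divide each frequency bin by the channel response.

    Assumes the channel taps are known and that the channel response has no
    zeros on the DFT grid (otherwise the division would fail).
    """
    n = len(received)
    H = dft(list(channel_taps) + [0] * (n - len(channel_taps)))
    R = dft(received)
    return idft([r / h for r, h in zip(R, H)])


# Usage: send one 64-sample block through a short channel and undo it.
block = [complex((t * 7) % 5 - 2, (t * 3) % 4 - 1.5) for t in range(N)]
taps = [1.0, 0.4 + 0.2j, -0.1j]
recovered = equalize(circular_convolve(block, taps), taps)
```

Working per frequency bin turns the convolution with the channel into one complex division per bin, which is what makes a frequency-domain equalizer attractive for a real-time hardware design; a fixed-point FPGA version would need additional decisions about word lengths and scaling that this sketch does not address.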

UPCommons. Portal del coneixement obert de la UPC

Video Object Segmentation: Methods and Datasets

Author: Caelles Prat Sergi
Publication venue: ETH Zurich
Publication date: 01/01/2020

Video Object Segmentation (VOS) is one of several tasks where deep learning has brought enormous performance gains. The task consists of segmenting the objects in a video, that is, grouping together pixels that belong to the same object, both in space (within a frame) and in time (across different frames). Sergi's dissertation focuses on two important aspects of advancing the topic: it proposes new VOS methods that advance the state of the art, and it releases new datasets and benchmarks on which such techniques can be trained and compared.

Repository for Publications and Research Data

Implementation of DSP algorithms in VHDL for high-speed optical communications

Author: Caelles Prat Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2014

Video Object Segmentation by Tracking Structured Key Points and Contours

Author: Caelles Prat Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date

RECERCAT

Implementation of DSP algorithms in VHDL for high-speed optical communications

Author: Caelles Prat Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date

RECERCAT